-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use as of system time when listing tables to improve performance #565
use as of system time when listing tables to improve performance #565
Conversation
@SpringBootTest(classes = {PolarisPersistenceConfig.class}) | ||
@ActiveProfiles(profiles = {"polaris_functional_test"}) | ||
@AutoConfigureDataJpa | ||
public class PolarisConnectorTableServiceFunctionalTest extends PolarisConnectorTableServiceTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for changing from h2db to postgresql and adding these functional tests for both the table service and store service
|
||
try { | ||
// pause execution for 10000 milliseconds (10 seconds) | ||
Thread.sleep(10000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the purpose of sleep to wait for the create to complete, the create call should be synchronous and return the created object after it is done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are now only reading the snapshot that is follower_reader_timestamp (4.8s) ago, we need to sleep for sometime so the following list operation can read the 3 tables created.
public List<PolarisTableEntity> findAllTablesByDbNameAndTablePrefix( | ||
final String dbName, final String tableNamePrefix, final int pageFetchSize) { | ||
Pageable page = PageRequest.of(0, pageFetchSize, Sort.by("tbl_name").ascending()); | ||
entityManager.createNativeQuery("SET TRANSACTION AS OF SYSTEM TIME follower_read_timestamp()") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entityManager
is an instance variable in this class so if multiple threads are all running for this api at the same time, would there be situations where they overwrite the timestamp set by each other, since follower_read_timestamp()
is a crdb function that is evaluated to a value upon execution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the entityManager is annotated with @PersistenceContext
it's scoped within a transaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the info and reference, upon closer look seems like spring will inject a special proxy for the entityManager
in this setup so indeed should be safe
https://github.com/spring-projects/spring-framework/blob/main/spring-orm/src/main/java/org/springframework/orm/jpa/SharedEntityManagerCreator.java
// pause execution for 10000 milliseconds (10 seconds) | ||
Thread.sleep(10000); | ||
} catch (InterruptedException e) { | ||
System.out.println("Sleep was interrupted"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: use logger.
public List<PolarisTableEntity> findAllTablesByDbNameAndTablePrefix( | ||
final String dbName, final String tableNamePrefix, final int pageFetchSize) { | ||
Pageable page = PageRequest.of(0, pageFetchSize, Sort.by("tbl_name").ascending()); | ||
entityManager.createNativeQuery("SET TRANSACTION AS OF SYSTEM TIME follower_read_timestamp()") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the entityManager is annotated with @PersistenceContext
it's scoped within a transaction.
Context:
In the context of Polaris, some database may have large number of tables, and when listing them, crdb will hold the lock and prevents other db write operations to be performed, leading to high latency.
Blocker
Initially, I thought we can simply include the AS OF SYSTEM in @query for JPA, which turns out to be not the case.
See this commit for reference. Basically it will throw
Caused by: org.postgresql.util.PSQLException: ERROR: AS OF SYSTEM TIME: only constant expressions or follower_read_timestamp are allowed
, and looks like jpa treat the passed in param as dynamic expressions.Alternative
Use entityManager to execute the native query instead.
What are some major changes